Constraint Categorial Grammars 



Luis Damas, Nelma Moreira 
{luis,nam}@ncc.up.pt

LIACC, Universidade do Porto 
Rua do Campo Alegre 823, 4150 Porto, Portugal 



Abstract. Although unification can be used to implement a weak form
of β-reduction, several linguistic phenomena are better handled by
using some form of λ-calculus. In this paper we present a higher-order
feature description calculus based on a typed λ-calculus. We show how
the techniques used in CCQ for resolving complex feature constraints can
be efficiently extended. CCCQ is a simple formalism, based on categorial
grammars, designed to test the practical feasibility of such a calculus.
Keywords: constraint satisfaction, computational semantics, higher-order
programming.



1 Introduction 

Unification-based formalisms are clearly unable to deal in a natural way
with phenomena such as the semantics of coordination, quantifier scoping
ambiguity or bound anaphora. As a matter of fact, although unification can be
used to implement a weak form of β-reduction, it seems that this kind of
phenomenon is better handled by using some form of λ-calculus [DSP91, Per90].
One possibility, which is at the heart of systems like λProlog [NM88], is to
extend both the notion of term, to include λ-abstraction and application, and
the definition of unification, to deal with λ-terms. For this extension to be
technically sound it is necessary to require λ-terms to be well typed. On the
other hand, it turns out that if instead of using terms we use complex feature
descriptions (where conjunction replaces unification), we can still follow the
same plan to produce a higher-order calculus of feature descriptions. CCCQ is a
simple formalism, based on categorial grammars, designed to test the practical
feasibility of such an approach. The main reason for selecting a categorial
framework for this experiment was that its simplicity allowed us to concentrate
on the constraint calculus itself. Another reason was the close historical
relationship between categorial grammars and semantic formalisms incorporating
λ-abstraction. CCCQ extends categorial grammar by associating not only a
category but also a higher-order feature description with each well-formed part
of speech. The type of these feature descriptions is determined by the
associated category. Note also that a derivation leading to an unsatisfiable
feature description is not legal. When compared with other formalisms (for
instance, [ZKC87]), one of the main distinguishing features of CCCQ is the fact
that it computes partial descriptions of feature structures and not the feature
structures themselves.



It is important to notice that this calculus is easily modified to deal with
constraints over finite or rational trees, instead of feature trees. Also, the
main advantage of this kind of calculus over general higher-order logic
programming systems for processing semantic representations in NLP systems is
its decidability. The rest of the paper proceeds as follows. We start by
defining a feature description calculus as a hybrid of λ-calculus and feature
logics and we present its denotational semantics. In section 2.1 we describe
a complete constraint solver for higher-order feature descriptions. In section 3
we define constraint categorial grammars and briefly present their
implementation. Some final remarks are given in section 4.

2 Feature Description Calculus 

The feature description calculus $\lambda_{\mathcal{FD}}$ at the heart of our
formalism is inspired both by the λ-calculus and by feature logics [Smo89, ST92].
For technical reasons, namely that we want to ensure the existence of normal
forms, it is a typed calculus. Our base types are bool for truth values and fs
for feature structures. Our types are described by

$$\tau ::= \mathit{bool} \mid \mathit{fs} \to \tau \mid \tau' \to \tau$$

Note that we exclude fs as the type of any feature description. This reflects our
commitment to compute partial descriptions of feature structures rather than
feature structures.

Now assume we are given a set of atoms $a, b, \ldots$, a set of feature symbols
$f, g, \ldots$, a set of feature structure variables $x, y, \ldots$, and, for each
type $\tau$, a set of variables of type $\tau$, $x_\tau, y_\tau, \ldots$ Then the
set of feature descriptions of type $\tau$ is described by

$$\begin{array}{rcl}
e_\tau & ::= & \mathit{true} \mid \mathit{false} \mid x_\tau \mid e_\tau \wedge e_\tau \mid e_\tau \vee e_\tau \mid \neg e_\tau \mid e_{\mathit{fs}\to\tau}\, x.p \mid e_{\mathit{fs}\to\tau}\, a \mid e_{\tau'\to\tau}\, e_{\tau'} \\
e_{\mathit{bool}} & ::= & t.p \doteq s \mid t \doteq s \\
e_{\mathit{fs}\to\tau} & ::= & \lambda x.e_\tau \\
e_{\tau'\to\tau} & ::= & \lambda x_{\tau'}.e_\tau
\end{array}$$

where $s$ and $t$ denote either atoms or feature structure variables, and $p$ is a
possibly empty sequence of feature symbols denoting a path in a feature
structure.

Note that the language thus defined includes both feature logics and a typed
λ-calculus.

We import from both theories such notions as substitution, free and bound
occurrences of variables, α- and β-reductions and βα-normal form. In
particular, a closed feature description is a feature description with no free
variables. Moreover, the feature constraints of feature logics, widely used in
unification grammars, correspond to a subset of the feature descriptions of type
bool, without abstractions or applications.



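To define a semantics for the calculus of feature descriptions we adopt the
standard model $\mathcal{RT}$ of rational trees for feature structures (see
[DMV94]) and associate with each type $\tau$ a semantic domain $D_\tau$ as follows

$$\begin{array}{rcl}
D_{\mathit{bool}} & = & \{0, 1\} \\
D_{\mathit{fs}\to\tau} & = & \mathcal{RT} \to D_\tau \\
D_{\tau'\to\tau} & = & D_{\tau'} \to D_\tau
\end{array}$$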

From this point on, a semantics for feature descriptions is defined in the same
way as for feature logics and the typed λ-calculus, by noting that the standard
boolean operations can be extended to all the semantic domains involved in a
componentwise fashion, e.g.

$$(\lambda x.e) \vee (\lambda x.e') =_{\mathrm{def}} (\lambda x.\, e \vee e').$$
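
The componentwise extension of a boolean operation to the function domains can be sketched in a few lines. This is only an illustration in Python (the system described in this paper is written in Prolog); the name `lift_or` and the modelling of feature structures as dictionaries are assumptions made for the example.

```python
from typing import Callable, Union

# A semantic value is either a truth value (type bool) or a function
# into semantic values (types fs -> t and t' -> t).
Value = Union[int, Callable]


def lift_or(d: Value, d2: Value) -> Value:
    """Disjunction extended componentwise to function domains."""
    if callable(d) and callable(d2):
        # (\x.e) v (\x.e') is the function mapping v to e(v) v e'(v)
        return lambda v: lift_or(d(v), d2(v))
    return 1 if (d == 1 or d2 == 1) else 0


# Two hypothetical denotations of type fs -> bool; feature structures
# are modelled here as plain dicts, which is an assumption of this sketch.
is_run = lambda fs: 1 if fs.get("reln") == "run" else 0
is_die = lambda fs: 1 if fs.get("reln") == "die" else 0

run_or_die = lift_or(is_run, is_die)
```

The same recursion on the structure of the type gives conjunction and negation.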

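More precisely, let an assignment $\rho$ be a mapping defined on variables, such
that $\rho(x) \in \mathcal{RT}$ and $\rho(x_\tau) \in D_\tau$, for each type $\tau$.
As usual, $\rho[d/a]$ denotes the assignment obtained from $\rho$ by mapping $a$ to
$d$. Let $f^{\mathcal{RT}}$, $p^{\mathcal{RT}}$ and $a^{\mathcal{RT}}$ denote the
interpretation of features, paths and atoms in $\mathcal{RT}$, respectively.
Furthermore, let $t^{\mathcal{RT}}\rho$ be $\rho(t)$ if $t$ is a variable and
$t^{\mathcal{RT}}$ otherwise. Then the semantics of the feature descriptions of
$\lambda_{\mathcal{FD}}$, given an assignment $\rho$, is defined inductively as
follows:

$$\begin{array}{rcl}
[\![x_\tau]\!]\rho & = & \rho(x_\tau) \\
[\![t.p \doteq s]\!]\rho & = & \begin{cases} 1 & \text{if } p^{\mathcal{RT}}(t^{\mathcal{RT}}\rho) = s^{\mathcal{RT}}\rho \\ 0 & \text{otherwise} \end{cases} \\
[\![t \doteq s]\!]\rho & = & \begin{cases} 1 & \text{if } t^{\mathcal{RT}}\rho = s^{\mathcal{RT}}\rho \\ 0 & \text{otherwise} \end{cases} \\
[\![\lambda x.e_\tau]\!]\rho & = & \boldsymbol{\lambda} v.[\![e_\tau]\!]\rho[v/x] \quad (v \in \mathcal{RT}) \\
[\![\lambda x_{\tau'}.e_\tau]\!]\rho & = & \boldsymbol{\lambda} v.[\![e_\tau]\!]\rho[v/x_{\tau'}] \quad (v \in D_{\tau'}) \\
[\![e_{\mathit{fs}\to\tau}\, x.p]\!]\rho & = & ([\![e_{\mathit{fs}\to\tau}]\!]\rho)\,[\![x.p]\!]\rho \\
[\![e_{\mathit{fs}\to\tau}\, a]\!]\rho & = & ([\![e_{\mathit{fs}\to\tau}]\!]\rho)\, a^{\mathcal{RT}} \\
[\![e_{\tau'\to\tau}\, e_{\tau'}]\!]\rho & = & ([\![e_{\tau'\to\tau}]\!]\rho)\,[\![e_{\tau'}]\!]\rho
\end{array}$$

where $\boldsymbol{\lambda}$ denotes function "abstraction" in set theory and
$(x \in D)$ means that $D$ is the domain of $x$. For the conjunction operation, we
define: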

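$$\begin{array}{rcl}
[\![e_{\mathit{bool}} \wedge e'_{\mathit{bool}}]\!]\rho & = & \begin{cases} 1 & \text{if } [\![e_{\mathit{bool}}]\!]\rho = 1 \text{ and } [\![e'_{\mathit{bool}}]\!]\rho = 1 \\ 0 & \text{otherwise} \end{cases} \\
[\![e_{\mathit{fs}\to\tau} \wedge e'_{\mathit{fs}\to\tau}]\!]\rho & = & \boldsymbol{\lambda} v.\, d \wedge d' \quad \text{where } [\![e_{\mathit{fs}\to\tau}]\!]\rho = \boldsymbol{\lambda} v.d,\ [\![e'_{\mathit{fs}\to\tau}]\!]\rho = \boldsymbol{\lambda} v.d' \quad (v \in \mathcal{RT}) \\
[\![e_{\tau'\to\tau} \wedge e'_{\tau'\to\tau}]\!]\rho & = & \boldsymbol{\lambda} v.\, d \wedge d' \quad \text{where } [\![e_{\tau'\to\tau}]\!]\rho = \boldsymbol{\lambda} v.d,\ [\![e'_{\tau'\to\tau}]\!]\rho = \boldsymbol{\lambda} v.d' \quad (v \in D_{\tau'})
\end{array}$$

and analogously for the other boolean operations. If $1_{\mathit{bool}} = 1$,
$1_{\mathit{fs}\to\tau} = \boldsymbol{\lambda} v.1_\tau$ and
$1_{\tau'\to\tau} = \boldsymbol{\lambda} v.1_\tau$, then the semantics of true, for
each type $\tau$, is defined by:

$$\begin{array}{rcl}
[\![\mathit{true}_{\mathit{bool}}]\!]\rho & = & 1_{\mathit{bool}} \\
[\![\mathit{true}_{\mathit{fs}\to\tau}]\!]\rho & = & \boldsymbol{\lambda} v.1_\tau \quad (v \in \mathcal{RT}) \\
[\![\mathit{true}_{\tau'\to\tau}]\!]\rho & = & \boldsymbol{\lambda} v.1_\tau \quad (v \in D_{\tau'})
\end{array}$$

and analogously for false.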

An important property of the feature description calculus is the existence of
normal forms under β-reduction, which is a simple consequence of well-typedness.
Another important property is that for any closed feature description of type
$\tau$ we can decide whether it is equivalent to false. This last property is
essentially an extension of the satisfiability problem for a complete
axiomatization of feature logics. For this reason we will say that a feature
description of type $\tau$ is satisfiable iff its semantics is not that of false.

2.1 Constraint Solver 

Our implementation of the feature description calculus is based on reduction
to normal form followed by the techniques used in CCQ [DV92, DMV94]
for resolving complex feature constraints. To cope with the NP-hardness of
the satisfiability problem, our approach is based on factoring out, in
polynomial time, the deterministic information contained in a complex constraint
and simplifying the remaining formula using that information. The deterministic
information corresponds to a conjunction of (positive) atomic constraints in
solved form¹, which we denote by $\mathcal{M}$. We say that $\mathcal{M}$ is a
partial model of $C$ if and only if every model of $C$ is a model of
$\mathcal{M}$. When every model of $\mathcal{M}$ is a model of $C$, but no proper
subset of $\mathcal{M}$ satisfies this condition, we say that $\mathcal{M}$ is a
minimal model of $C$. By using disjunctive forms it can be proved that any set of
feature constraints $C$ admits at most a finite number of minimal models².
In [DV92, DMV94, Mor95] a rewrite system was presented that, from a complex
feature constraint $C_0$, produces a pair $(\mathcal{M}, C)$, where $\mathcal{M}$
is in solved form, $C$ is smaller than $C_0$,
$\mathcal{RT} \models C_0 \leftrightarrow \mathcal{M} \wedge C$, and any minimal
model of $C_0$ can be obtained by conjoining a minimal model of $C$ with
$\mathcal{M}$. Moreover the rewriting system is complete in the sense that
$\mathcal{M} \wedge C$ is satisfiable, unless it produces false as the final model.
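
The three solved-form conditions listed in footnote 1 can be checked mechanically. The following Python sketch is only an illustration; the tuple encoding of constraints and the names `Var`, `occurrences` and `is_solved_form` are assumptions of the example and are not taken from the implementation described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A feature structure variable; atoms are plain strings."""
    name: str


# Assumed encoding: ("feat", x, f, s) stands for x.f = s
#                   ("eq", x, s)      stands for x = s


def occurrences(x, constraints):
    """Number of times variable x occurs in the conjunction."""
    return sum(1 for c in constraints for t in c[1:] if t == x)


def is_solved_form(constraints):
    feature_values = {}
    for c in constraints:
        # Condition 1: only x.f = s or x = s, with x a variable
        if c[0] not in ("feat", "eq") or not isinstance(c[1], Var):
            return False
        if c[0] == "eq":
            # Condition 2: an eliminated variable occurs exactly once
            if occurrences(c[1], constraints) != 1:
                return False
        else:
            # Condition 3: a feature of a variable has a single value
            key = (c[1], c[2])
            if feature_values.setdefault(key, c[3]) != c[3]:
                return False
    return True


x, y = Var("x"), Var("y")
```

The check only recognises a solved form; reducing an arbitrary conjunction to one requires the rewrite rules cited in the text.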

¹ A conjunction of feature constraints $\mathcal{M}$ is a solved form if:

  1. every constraint in $\mathcal{M}$ is of the form $x.f \doteq s$ or $x \doteq s$

  2. if $x \doteq s$ is in $\mathcal{M}$ then $x$ occurs exactly once in $\mathcal{M}$

  3. if $x.f \doteq s$ and $x.f \doteq t$ are in $\mathcal{M}$ then $s = t$

  Any conjunction of atomic constraints is satisfiable if and only if it can be
  reduced to a solved form [Smo89, Mah88, DMV94].

² Actually it is necessary to extend the notion of models to include negative
  atomic constraints, but that will not be addressed here.

We now extend that rewrite system to higher-order feature descriptions. First
we give some more characterizations of feature descriptions. A basic normal
description of type $\tau$ is described by:

$$\begin{array}{rcl}
e_\tau & ::= & \mathit{true} \mid \mathit{false} \mid x_\tau \mid x_\tau \wedge e_\tau \mid x_\tau \vee e_\tau \mid \neg e_\tau \mid x_{\mathit{fs}\to\tau}\, x.p \mid x_{\mathit{fs}\to\tau}\, a \mid x_{\tau'\to\tau}\, e_{\tau'} \\
e_{\mathit{bool}} & ::= & t.p \doteq s \mid t \doteq s \\
e_{\sigma\to\tau} & ::= & \lambda x_\sigma.e_\tau
\end{array}$$

Then, every closed feature description in basic normal form will be of the form
$\lambda \bar{x}_\sigma.e_{\mathit{bool}}$, where $\bar{x}_\sigma$ denotes a sequence
of bound variables of some types and $e_{\mathit{bool}}$ is not an abstraction.
Omitting the λ prefix, given a feature description of type bool,
$e_{\mathit{bool}}$, the solver will produce a partial model $\mathcal{M}$ and a
smaller feature description $e'_{\mathit{bool}}$, or false:



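$$\begin{array}{rcll}
(\mathcal{M},\ e_{\mathit{bool}} \wedge \mathit{false}) & \to & (\bot,\ \mathit{false}) & \\
(\mathcal{M},\ e_{\mathit{bool}} \wedge \mathit{true}) & \to & (\mathcal{M},\ e_{\mathit{bool}}) & \\
(\mathcal{M},\ e_{\mathit{bool}} \wedge s \doteq t) & \to & (\mathcal{M} \wedge s \doteq t,\ e_{\mathit{bool}}) & \\
(\mathcal{M},\ e_{\mathit{bool}} \wedge t.p \doteq s) & \to & (\mathcal{M} \wedge t.p \doteq s,\ e_{\mathit{bool}}) & \\
(\mathcal{M},\ e_{\mathit{bool}}) & \to & (\mathcal{M},\ e'_{\mathit{bool}}) & \text{if } e_{\mathit{bool}} \to^{*}_{\mathcal{M}} e'_{\mathit{bool}}
\end{array}$$

with the convention that after each application of one of the rewrite rules the
new partial model is reduced to solved form (or false). The complete rewrite
system $\to_{\mathcal{M}}$ is: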



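$$\begin{array}{ll}
(\lambda x.e_\tau)\, x.p \to_{\mathcal{M}} e_\tau[x.p/x] & \neg(\lambda x.e) \to_{\mathcal{M}} \lambda x.\neg e \\
(\lambda x.e_\tau)\, a \to_{\mathcal{M}} e_\tau[a/x] & (\lambda x.e) \wedge (\lambda x.e') \to_{\mathcal{M}} \lambda x.\, e \wedge e' \\
(\lambda x_{\tau'}.e_\tau)\, e'_{\tau'} \to_{\mathcal{M}} e_\tau[e'_{\tau'}/x_{\tau'}] & (\lambda x.e) \vee (\lambda x.e') \to_{\mathcal{M}} \lambda x.\, e \vee e'
\end{array} \tag{1}$$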



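$$\begin{array}{rcll}
e_\tau \wedge x_\tau & \to_{\mathcal{M}} & x_\tau \wedge e_\tau & \text{if } e_\tau \text{ is not a variable} \\
e_\tau \wedge (x_\tau \wedge e'_\tau) & \to_{\mathcal{M}} & x_\tau \wedge (e_\tau \wedge e'_\tau) & \text{if } e_\tau \text{ is not a variable} \\
(e_\tau \wedge e'_\tau) \wedge e''_\tau & \to_{\mathcal{M}} & e_\tau \wedge (e'_\tau \wedge e''_\tau) & \\
e_\tau \vee x_\tau & \to_{\mathcal{M}} & x_\tau \vee e_\tau & \text{if } e_\tau \text{ is not a variable} \\
e_\tau \vee (x_\tau \vee e'_\tau) & \to_{\mathcal{M}} & x_\tau \vee (e_\tau \vee e'_\tau) & \text{if } e_\tau \text{ is not a variable} \\
(e_\tau \vee e'_\tau) \vee e''_\tau & \to_{\mathcal{M}} & e_\tau \vee (e'_\tau \vee e''_\tau) & \\
(x_{\mathit{fs}\to\tau} \wedge e_{\mathit{fs}\to\tau})\, x.p & \to_{\mathcal{M}} & (x_{\mathit{fs}\to\tau})\, x.p \wedge (e_{\mathit{fs}\to\tau})\, x.p & \\
(x_{\mathit{fs}\to\tau} \wedge e_{\mathit{fs}\to\tau})\, a & \to_{\mathcal{M}} & (x_{\mathit{fs}\to\tau})\, a \wedge (e_{\mathit{fs}\to\tau})\, a & \\
(x_{\tau'\to\tau} \wedge e'_{\tau'\to\tau})\, e_{\tau'} & \to_{\mathcal{M}} & (x_{\tau'\to\tau})\, e_{\tau'} \wedge (e'_{\tau'\to\tau})\, e_{\tau'} & \\
(x_{\mathit{fs}\to\tau} \vee e_{\mathit{fs}\to\tau})\, x.p & \to_{\mathcal{M}} & (x_{\mathit{fs}\to\tau})\, x.p \vee (e_{\mathit{fs}\to\tau})\, x.p & \\
(x_{\mathit{fs}\to\tau} \vee e_{\mathit{fs}\to\tau})\, a & \to_{\mathcal{M}} & (x_{\mathit{fs}\to\tau})\, a \vee (e_{\mathit{fs}\to\tau})\, a & \\
(x_{\tau'\to\tau} \vee e'_{\tau'\to\tau})\, e_{\tau'} & \to_{\mathcal{M}} & (x_{\tau'\to\tau})\, e_{\tau'} \vee (e'_{\tau'\to\tau})\, e_{\tau'} &
\end{array} \tag{2}$$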



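$$\begin{array}{rcll}
\lambda x.e_\tau & \to_{\mathcal{M}} & \lambda x.e'_\tau & \text{if } e_\tau \to_{\mathcal{M}} e'_\tau \\
\lambda x_{\tau'}.e_\tau & \to_{\mathcal{M}} & \lambda x_{\tau'}.e'_\tau & \text{if } e_\tau \to_{\mathcal{M}} e'_\tau \\
(x_{\tau'\to\tau}\, e_{\tau'}) & \to_{\mathcal{M}} & (x_{\tau'\to\tau}\, e'_{\tau'}) & \text{if } e_{\tau'} \to_{\mathcal{M}} e'_{\tau'}
\end{array} \tag{3}$$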



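$$\begin{array}{rcl@{\qquad}rcl}
\mathit{false}_{\mathit{fs}\to\tau}\, x.p & \to_{\mathcal{M}} & \mathit{false}_\tau & \mathit{true}_\tau \wedge e_\tau & \to_{\mathcal{M}} & e_\tau \\
\mathit{false}_{\mathit{fs}\to\tau}\, a & \to_{\mathcal{M}} & \mathit{false}_\tau & e_\tau \wedge \mathit{false}_\tau & \to_{\mathcal{M}} & \mathit{false}_\tau \\
\mathit{false}_{\tau'\to\tau}\, e_{\tau'} & \to_{\mathcal{M}} & \mathit{false}_\tau & e_\tau \wedge \mathit{true}_\tau & \to_{\mathcal{M}} & e_\tau \\
\mathit{true}_{\mathit{fs}\to\tau}\, x.p & \to_{\mathcal{M}} & \mathit{true}_\tau & \mathit{false}_\tau \vee e_\tau & \to_{\mathcal{M}} & e_\tau \\
\mathit{true}_{\mathit{fs}\to\tau}\, a & \to_{\mathcal{M}} & \mathit{true}_\tau & \mathit{true}_\tau \vee e_\tau & \to_{\mathcal{M}} & \mathit{true}_\tau \\
\mathit{true}_{\tau'\to\tau}\, e_{\tau'} & \to_{\mathcal{M}} & \mathit{true}_\tau & e_\tau \vee \mathit{false}_\tau & \to_{\mathcal{M}} & e_\tau \\
\mathit{false}_\tau \wedge e_\tau & \to_{\mathcal{M}} & \mathit{false}_\tau & e_\tau \vee \mathit{true}_\tau & \to_{\mathcal{M}} & \mathit{true}_\tau
\end{array} \tag{4}$$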



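$$\begin{array}{rcll}
x & \to_{\mathcal{M}} & t & \text{if } x \doteq t \in \mathcal{M} \\
a.p \doteq s & \to_{\mathcal{M}} & \mathit{false} & \\
x.p \doteq s & \to_{\mathcal{M}} & t \doteq s & \text{if } x.p \doteq t \in \mathcal{M} \\
a \doteq b & \to_{\mathcal{M}} & \mathit{false} & \\
t \doteq t & \to_{\mathcal{M}} & \mathit{true} & \\
x \doteq t \wedge e_{\mathit{bool}} & \to_{\mathcal{M}} & x \doteq t \wedge e'_{\mathit{bool}} & \text{if } e_{\mathit{bool}} \to_{\mathcal{M} \wedge x \doteq t} e'_{\mathit{bool}} \\
x \doteq t \wedge e_{\mathit{bool}} & \to_{\mathcal{M}} & \mathit{false} & \text{if } \mathcal{M} \wedge x \doteq t \to \bot \\
x.p \doteq t \wedge e_{\mathit{bool}} & \to_{\mathcal{M}} & x.p \doteq t \wedge e'_{\mathit{bool}} & \text{if } e_{\mathit{bool}} \to_{\mathcal{M} \wedge x.p \doteq t} e'_{\mathit{bool}} \\
x.p \doteq t \wedge e_{\mathit{bool}} & \to_{\mathcal{M}} & \mathit{false} & \text{if } \mathcal{M} \wedge x.p \doteq t \to \bot
\end{array} \tag{5}$$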



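$$(e_\tau \vee e'_\tau) \wedge e''_\tau \to_{\mathcal{M}} (e_\tau \wedge e''_\tau) \vee (e'_\tau \wedge e''_\tau) \quad \text{if both } e_\tau \text{ and } e'_\tau \text{ are } \mathcal{M}\text{-dependent with } e''_\tau \tag{6}$$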

We assume that α-reductions will be performed whenever necessary. For
simplicity we omitted the rules concerning negation. The rewrite system is
divided into six groups, each one dealing with: (1) β-reduction (where $e[d/x]$
denotes the substitution of $d$ for $x$ in $e$), abstraction and boolean
operations for higher-order types; these rules are applied before any other
rule; (2) application and boolean operations; (3) rewriting inside abstractions
and applications; (4) false and true; (5) feature descriptions of type bool,
$e_{\mathit{bool}}$; these rules essentially correspond to the feature constraint
rewrite system in [DMV94]; (6) the distributive law; this rule must apply only
when both $e_\tau$ and $e'_\tau$ have variables in common with $e''_\tau$, possibly
through "bindings" in $\mathcal{M}$³. If this last rule is omitted, the rewrite
process becomes polynomial although incomplete.

Theorem 2.1 Given a closed feature description $e_\tau$, the rewrite system is
correct, terminating and complete in the sense that $e_\tau$ is satisfiable unless
false is produced. Moreover the final feature description is in basic normal form.

For a proof of the above results see [Mor95].



3 Constraint Categorial Grammar 

In this section we show how the expressiveness of categorial grammars can be 
augmented using feature descriptions. 

We will use a basic (rigid) categorial grammar (CG), consisting of a set of
categories, a lexicon which assigns categories to words and a calculus which
determines the set of admissible category combinations. Given a set of basic
categories $Cat_0$ we define recursively the set of categories $Cat$ by: the
elements of $Cat_0$ are categories; if $A$ and $B$ are categories then $A/B$ and
$A\backslash B$ are categories. Some unary (lexical) rules (lifting, division,
etc.) will be added to provide a flexible CG which can cope with discontinuity
and other linguistic phenomena. Semantically these rules allow functional
abstraction over displaced or missing elements.

³ The notion of $\mathcal{M}$-dependence coincides with the one for complex
  feature constraints [DMV94], if $x \in c$ means that $x$ occurs free in $c$.
  Given two constraints $c_1$ and $c_2$ and a model $\mathcal{M}$, $c_1$ and $c_2$
  are $\mathcal{M}$-dependent if and only if
  $Var_{\mathcal{M}}(c_1) \cap Var_{\mathcal{M}}(c_2) \neq \emptyset$, where
  $Var_{\mathcal{M}}(c)$ is the smallest set satisfying: if $x \in c$, then
  $x \in Var_{\mathcal{M}}(c)$; if $x \in Var_{\mathcal{M}}(c)$ and
  $x.f \doteq z \in \mathcal{M}$, then $z \in Var_{\mathcal{M}}(c)$.
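
The set $Var_{\mathcal{M}}(c)$ of footnote 3 is a least fixed point and can be computed by a simple reachability loop. The Python sketch below is only illustrative: the function names are hypothetical, and it assumes that the model is given as triples `(x, f, z)` for the bindings $x.f \doteq z$ whose value $z$ is a variable (bindings to atoms are simply left out).

```python
def var_m(free_vars, model):
    """Smallest set containing free_vars and closed under the
    bindings x.f = z of the model (given as triples (x, f, z))."""
    reached = set(free_vars)
    frontier = list(free_vars)
    while frontier:
        x = frontier.pop()
        for (y, _feature, z) in model:
            if y == x and z not in reached:
                reached.add(z)
                frontier.append(z)
    return reached


def m_dependent(vars1, vars2, model):
    """Two constraints, given by their free variables, are M-dependent
    when their closures under the model share a variable."""
    return bool(var_m(vars1, model) & var_m(vars2, model))


# Hypothetical partial model: s.arg1 = x and t.scope = u
model = [("s", "arg1", "x"), ("t", "scope", "u")]
```

With this model, a constraint over `s` and a constraint over `x` are dependent through the binding `s.arg1 = x`, so the distributive law would be allowed to apply to them.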

A Constraint Categorial Grammar is a tuple < Cat, T, Lexicon, Rules >
where

1. Cat is a set of base categories

2. T is a map which associates with each category C a type T(C) and satisfies

   $$T(A/B) = T(B\backslash A) = T(B) \to T(A)$$

3. Lexicon is a set of triples < w, A, c >, where w is a word, A a category and
   c is a feature description of type T(A)

4. Rules is the set of inference rules to combine pairs $A - c$ of syntactic
   categories and feature descriptions (semantic representation).

The inference rules used in the current grammars are:

$$\begin{array}{lll}
(\mathit{app}/) & \dfrac{A/B - c_f \qquad B - c_b}{A - c_f\, c_b} & \text{if } c_f\, c_b \text{ is satisfiable} \\[2ex]
(\mathit{app}\backslash) & \dfrac{B - c_b \qquad B\backslash A - c_f}{A - c_f\, c_b} & \text{if } c_f\, c_b \text{ is satisfiable}
\end{array}$$

plus a set of unary rules.

3.1 A sample grammar

Figure 1 gives a fragment of an English grammar written in CCCQ. We
use '\' for 'λ', '&' for '∧' and '|' for '∨'. All variables are bound and can be
any string of letters. The let constructor allows the use of macros in the
writing of the lexicon. The transformation constructor implements unary rules
for type raising. Type raising rules are allowed only for some categories and
their application is controlled during execution. The lex constructor is used
for each lexical entry. In this experiment we do not impose any type discipline
(HPSG style) on the feature structures themselves⁴. If we assign to each part of
speech a feature structure, then an associated feature description will be of
type fs → ... → bool. For instance, if we assign the type fs → bool to "John",
with semantics λs.s ≐ john, and assign the type fs → fs → bool to "runs", with
semantics λx.λs.s.reln ≐ run ∧ s.arg1 ≐ x, the sentence "John runs" would
have the type fs → bool and semantics λs.s.reln ≐ run ∧ s.arg1 ≐ john. Once
more we note that the use of partial descriptions allows us to express directly
the relations between the several constituents. The semantics used is inspired
by the ones in [PS87].

⁴ Nor is the distinction between "syntactic" and "semantic" features made.
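
The β-reduction rule $(\lambda x.e)\,a \to e[a/x]$ can be illustrated on the semantics given above for "runs". The Python sketch below is a deliberately small illustration and not the system's normaliser: terms are nested tuples, arguments are restricted to atoms or feature structure variables written as strings, bound variable names are assumed to be distinct (so no α-renaming is needed), and applying "runs" directly to the atom john is an assumption made only to show one reduction step.

```python
# Assumed term encoding:
#   ("lam", x, body)          abstraction  \x.body
#   ("app", fun, arg)         application to an atom or fs variable (a string)
#   ("and", left, right)      conjunction
#   ("eq", head, path, value) atomic constraint head.path = value


def subst(e, x, d):
    """e[d/x]: replace the variable x by d in e."""
    tag = e[0]
    if tag == "lam":
        # An inner binder with the same name shadows x
        return e if e[1] == x else ("lam", e[1], subst(e[2], x, d))
    if tag == "and":
        return ("and", subst(e[1], x, d), subst(e[2], x, d))
    if tag == "app":
        return ("app", subst(e[1], x, d), d if e[2] == x else e[2])
    _, head, path, value = e  # tag == "eq"
    return ("eq", d if head == x else head, path, d if value == x else value)


def beta(e):
    """Reduce every redex (\\x.e) a to e[a/x]."""
    tag = e[0]
    if tag == "app":
        fun = beta(e[1])
        if fun[0] == "lam":
            return beta(subst(fun[2], fun[1], e[2]))
        return ("app", fun, e[2])
    if tag == "lam":
        return ("lam", e[1], beta(e[2]))
    if tag == "and":
        return ("and", beta(e[1]), beta(e[2]))
    return e


# \x.\s. s.reln = run & s.arg1 = x
runs = ("lam", "x", ("lam", "s",
        ("and", ("eq", "s", ("reln",), "run"), ("eq", "s", ("arg1",), "x"))))

john_runs = beta(("app", runs, "john"))
```

The result is the encoding of λs.s.reln ≐ run ∧ s.arg1 ≐ john, the description quoted above for "John runs".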



Base-Categories          % Define the set of base categories
  s   = fs->bool,        % and their types
  iv  = fs->fs->bool,
  np  = s/iv,
  tv  = iv/np,
  dv  = tv/np,
  n   = fs->fs->bool,
  det = np/n,
  pp  = fs->bool;
transformation           % define a type raising rule
  np = (s/np)/(iv/np) : \S \Vt \C. S (Vt C);
%%%%%%%% some useful abbreviations
% agreement specifications
let 3RD_SG = \X. X.pers=p3 & X.nb=sg;
let NOT_3RD_SG = \X. X.pers\=p3 | X.nb\=sg;
let ANY = \X. X=X;
% proper nouns (generalized quantifier type)
let PN(W) = \P.\s. s.quant=exists_one & s.arg.reln=naming
            & s.arg.arg1=W & 3RD_SG(s.arg) & P s.arg s.pred;
% common nouns (AGR is an agreement)
let CN(W,AGR) = \s. s.reln=W & s.arg1=x & AGR s;
% determiners
let DET(Q,AGR) = \N. \P. \s. s.quant=Q & AGR s.var &
                 N s.var s.range & P s.var s.scope;
% intransitive verbs
let IV(W,AGR) = \s.\p. p.reln=W & p.arg1=s & AGR s;
% transitive verbs (Obj is the semantics of the object)
let TV(W,AGR) = \Obj. \su.\p. Obj (\o \q. q.reln=W & q.arg1=su & q.arg2=o
let V_PP(W,AGR) = \SS. \su \s. SS s.arg2 & s.reln=W & s.arg1=su & AGR su
% ditransitive verbs
let DV(W,AGR) = \Ci. \Cs. \subj. \si. Cs ( \ind. \s. Ci ( \obj \p. p.reln=W &
                p.arg1=subj & p.arg2=obj & p.arg3=ind) s) si & AGR subj;
%%%%%%%%%%%%%% lexicon
lex a, det, DET(exists_one,3RD_SG);
lex every, det, DET(all,ANY);
lex book, n, CN(book,3RD_SG);
lex man, n, CN(man,3RD_SG);
lex john, np, PN(john);
lex mary, np, PN(mary);
lex died, iv, IV(die,3RD_SG);
lex loves, tv, TV(love,3RD_SG);
lex read, tv, TV(read,ANY);
lex said, iv/pp, V_PP(say,ANY);
lex gave, dv, DV(give,ANY);
lex that, pp/s, \s.s;
% coordination
lex and, s\(s/s), \S1\S2\s. s.type=coord & S1 s.arg1 & S2 s.arg2;
lex and, np\((tv\iv)/np), \NP1\NP2\VT. \subj\s. s.type=coord &
         VT NP1 subj s.arg1 & VT NP2 subj s.arg2;
lex and, iv\(iv/iv), \V1\V2. \subj.\s.
         s.type=coord & V1 subj s.arg1 & V2 subj s.arg2;
lex and, np\(np/np), \NP1\NP2\VT\s. s.type=coord &
         NP1 VT s.arg1 & NP2 VT s.arg2;


Fig. 1. Sample grammar 
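
The map T of the definition in section 3 extends the types of the base categories to every category by T(A/B) = T(B\A) = T(B) → T(A). The Python sketch below computes it for the categories declared in figure 1; the tuple encoding of categories and types and the names `type_of` and `show` are assumptions of the example.

```python
# Base categories of figure 1 with their types; ("->", a, b) encodes a -> b.
def arrow(a, b):
    return ("->", a, b)


BASE_TYPES = {
    "s": arrow("fs", "bool"),
    "iv": arrow("fs", arrow("fs", "bool")),
    "n": arrow("fs", arrow("fs", "bool")),
    "pp": arrow("fs", "bool"),
}

# Derived categories of figure 1; ("/", A, B) encodes A/B and
# ("\\", B, A) encodes B\A (argument B on the left, result A).
ABBREVIATIONS = {
    "np": ("/", "s", "iv"),
    "tv": ("/", "iv", "np"),
    "dv": ("/", "tv", "np"),
    "det": ("/", "np", "n"),
}


def type_of(cat):
    """T(A/B) = T(B\\A) = T(B) -> T(A)."""
    if isinstance(cat, str):
        if cat in BASE_TYPES:
            return BASE_TYPES[cat]
        return type_of(ABBREVIATIONS[cat])
    op, left, right = cat
    if op == "/":
        result, argument = left, right
    else:
        argument, result = left, right
    return arrow(type_of(argument), type_of(result))


def show(t):
    """Print a type with the usual right-associative arrows."""
    if isinstance(t, str):
        return t
    _, a, b = t
    left = show(a) if isinstance(a, str) else "(" + show(a) + ")"
    return left + " -> " + show(b)
```

For instance, `show(type_of("np"))` yields the generalized quantifier type `(fs -> fs -> bool) -> fs -> bool`, which is the type the macro PN must have.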



Processing

CCCQ is implemented in Prolog augmented with the constraint
solver for feature descriptions⁵. In this section we briefly describe this
implementation. Although the feature descriptions used in the grammar are
untyped, a type inference algorithm is used to infer types for each expression.
Moreover, for each lexical entry the type of the feature description is checked
against that of the category and, whenever possible, the normal form of the
feature description is computed. The inference rules are built into the grammar
processor. Currently, we use a bottom-up chart parser that builds a context-free
backbone. Each edge is a (Prolog) term arc(Begin, End, Cat, Sref), where Cat is
the category spanning from Begin to End and Sref is the information to be used to
extract the semantic representation, which reflects how this edge was formed: if
it was a lexical entry, Sref is a reference to it; if it results from a left
(right) application rule, it is a pair of references to its daughters; if it
results from a unary rule, it is a pair of references to the initial category and
to that rule. When the parse trees are successfully built, the semantic
representation is extracted and the constraint solver applied. These two
components can be interleaved in order to prune inconsistent edges as soon as
possible. As is apparent from the sample grammar (figure 1), the semantic
representations can become very cumbersome to write and visualize. So a graphical
"workbench", based on a Tcl/Tk interface to Yap Prolog, was provided to edit
grammars and lexicons, as well as to visualize the parse trees and semantic
representations (as matrix boxes).
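
The construction of the context-free backbone can be sketched as follows. This is a minimal illustration in Python rather than the Prolog chart parser described above: it covers only the two application rules (no unary rules and no constraint solving), and the tuple layout of the edges merely mirrors the term arc(Begin, End, Cat, Sref). All function names are hypothetical.

```python
# ("/", A, B) encodes A/B; ("\\", B, A) encodes B\A (argument on the left).
ABBREV = {"np": ("/", "s", "iv"), "tv": ("/", "iv", "np"), "det": ("/", "np", "n")}
LEXICON = {"john": "np", "read": "tv", "a": "det", "book": "n"}


def expand(cat):
    """Replace the abbreviations np, tv, det by their definitions."""
    if isinstance(cat, str):
        return expand(ABBREV[cat]) if cat in ABBREV else cat
    op, left, right = cat
    return (op, expand(left), expand(right))


def combine(left, right):
    """Categories obtained from two adjacent categories by application."""
    results = []
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        results.append(left[1])       # A/B  B   =>  A
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        results.append(right[2])      # B  B\A   =>  A
    return results


def parse(words):
    """Bottom-up chart; each edge is (begin, end, cat, sref)."""
    n = len(words)
    arcs = [(i, i + 1, expand(LEXICON[w]), ("lex", w)) for i, w in enumerate(words)]
    for width in range(2, n + 1):
        for begin in range(0, n - width + 1):
            end = begin + width
            for mid in range(begin + 1, end):
                lefts = [k for k, a in enumerate(arcs) if a[:2] == (begin, mid)]
                rights = [k for k, a in enumerate(arcs) if a[:2] == (mid, end)]
                for l in lefts:
                    for r in rights:
                        for cat in combine(arcs[l][2], arcs[r][2]):
                            # sref records the daughters of the new edge
                            arcs.append((begin, end, cat, ("app", l, r)))
    return arcs


arcs = parse("john read a book".split())
spans = [a[:3] for a in arcs]
```

On "john read a book" the chart contains an edge of category iv over "read a book", built from iv/(s/iv) and s/iv, and an edge of category s over the whole sentence.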

An Example

As an example we analyze the parsing of the sentence "a man
said that john read a book and mary died". There are two possible parse
trees for this sentence, one with the coordination in the scope of the relative
clause and the other with a wider scope. The semantic representation of this
sentence will be a feature description \x_1.X1|X2, where X1 and X2 are partially
represented in figures 2 and 3. Figures 4 and 5 show the semantics of the
sentences "john read a book" and "mary died", respectively. In the feature
description X2 (figure 3) the former semantics is identified with the value of
x_1.arg1.scope.arg2 and the latter is identified with the value of x_1.arg2. In
the feature description X1 (figure 2) the value of x_1.scope.arg2 is the feature
structure corresponding to the coordination of these two sentences. As remarked
in the previous section, the parsing process first builds a parse forest using
only the categories of lexical items and the inference (and unary) rules for
syntactic categories. The parse tree of the sentence we are considering is too
large to be shown here, so figure 6 shows only the parse forest of "john read a
book". In the first row we have the syntactic categories of each lexical item
(given in the lexicon or derived by a unary rule). In the following rows each
entry corresponds to the possible ways of deriving a category spanning a portion
of the input sentence. For instance, the category iv can be derived in the third
row from iv/(s/iv) and s/iv, spanning "read a book". Then, for each parse tree
that spans the whole sentence with root category s, the semantic representations
of the constituents are combined and, if the constraint solver does not produce
false, a semantic representation is derived.

⁵ So it can be seen as an instance of CLP($\lambda_{\mathcal{FD}}$).
Fig. 2. Coordination inside relative
Fig. 3. Coordination wider scope



4 Final Remarks 

The current implementation of CCCQ shows the practical feasibility of using
higher-order feature structure descriptions as semantic representations. This
reflects the fact that the complexity of the satisfiability problem for
higher-order feature descriptions is essentially the same as for feature logics.
We should also point out that the good performance of the system results in part
from its hybrid nature, where a categorial grammar with atomic base categories is
used to guide parsing. Some more toy English grammars were written that
can handle some kinds of discontinuity, modifiers and quantifier scope. However,
the introduction of a type discipline and a more general treatment of recursive
lexical rules [BvN94] must be considered in future work. On the other hand, most
recent developments of categorial grammars are based on the Lambek calculus
[Lam58, Moo88, Mor94] (an intuitionistic fragment of Linear Logic). Some
implementations for the propositional fragment are based on chart parsers
[Kon94, Hep92] and we conjecture that the $\lambda_{\mathcal{FD}}$ calculus can be
successfully used in such systems for processing semantic representations. From
an implementation perspective it would be helpful to study how current
techniques employed in functional programming implementations, namely the use of
combinators, can be imported to improve the computation of β-reductions.

Fig. 4. Semantics of "john read a book".
Fig. 5. Semantics of "mary died".
Fig. 6. A parse forest